[WIP][Java] Use MemorySegment for index serialization #1035
ldematte wants to merge 14 commits into rapidsai:branch-25.08
…c Dataset benchmarks
|
On the C side things are pretty simple: the C++ API already has an overload accepting a …
I'd like to refine this further; at the moment the C API user (including our Java bindings) needs to "guess" a "large enough" buffer to pass to …
cuvs/cpp/src/neighbors/detail/dataset_serialize.hpp, lines 45 to 67 in a572273
… it is possible to know the total size of the data before even serializing it (and with no need to grab data from the GPU either). So I was thinking we could add functions to the C and C++ API to get the serialization size. Something like …
Wdyt? |
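To make the size-query idea concrete, here is a minimal sketch of the two-step pattern (ask for the size, then serialize into an exactly sized buffer). Everything in it is an assumption for illustration: `toy_index`, `toy_index_serialized_size`, and `toy_index_serialize` are hypothetical names and not part of the cuVS C API, and the byte layout is made up. The point is only that the size can be derived from metadata, without touching the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an index; NOT a cuVS type. */
typedef struct {
  uint32_t rows;
  uint32_t dim;
  const float *data; /* rows * dim values */
} toy_index;

/* Hypothetical size query: uses only metadata (rows, dim), never reads data. */
size_t toy_index_serialized_size(const toy_index *idx) {
  return 2 * sizeof(uint32_t) + (size_t)idx->rows * idx->dim * sizeof(float);
}

/* Hypothetical serializer into a caller-provided buffer.
 * Returns the number of bytes written, or 0 if the buffer is missing or too small. */
size_t toy_index_serialize(const toy_index *idx, void *buf, size_t buf_size) {
  size_t needed = toy_index_serialized_size(idx);
  size_t payload = (size_t)idx->rows * idx->dim * sizeof(float);
  if (buf == NULL || buf_size < needed) return 0;

  unsigned char *p = buf;
  memcpy(p, &idx->rows, sizeof idx->rows); /* header: row count */
  p += sizeof idx->rows;
  memcpy(p, &idx->dim, sizeof idx->dim);   /* header: dimension */
  p += sizeof idx->dim;
  if (payload > 0) memcpy(p, idx->data, payload); /* payload: raw floats */
  return needed;
}
```

With something along these lines, a caller (the Java bindings included) would call the size function first, allocate exactly that many bytes, and then serialize, so no "large enough" guess is needed.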
FYI, I was chasing this exact optimization by serializing directly. I've just opened draft PR #1071, where I pass an existing file and have cuVS append the index to it. After the file has been written, checking the file size gives us the offset. |
I thought about that, but it looked like a bit of a "hack" to me and I preferred this approach. Let me elaborate.
I think 4 is the closest to an ostream, but it's more complex; 1 and 2 are "weird" in an API IMO. 1 would also require re-opening on the C side something that might already be open (exclusively) for write on the Java side (by ES or Lucene); 2 could avoid that, but dealing with file descriptors is difficult or impossible in Java. Also, 1 and 2 limit the functionality of the API to files (3 and 4 are more generic). That's why I landed on 3 for this draft: it was generic enough and easy to implement. |
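For comparison, here is a rough sketch of what an ostream-like C API could look like. The numbered options are not reproduced in this thread, so this assumes that option 4 means a write callback supplied by the caller; that reading is a guess. All names (`sink_write_fn`, `toy_serialize_to_sink`, `mem_sink`, `file_sink_write`) are hypothetical and not part of cuVS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*sink_write_fn)(void *user_data, const void *bytes, size_t len);

/* Hypothetical serializer: emits a count header, then the payload,
 * through whatever sink the caller supplies. */
int toy_serialize_to_sink(const uint32_t *values, uint32_t count,
                          sink_write_fn write, void *user_data) {
  int rc = write(user_data, &count, sizeof count);
  if (rc != 0) return rc;
  if (count == 0) return 0;
  return write(user_data, values, (size_t)count * sizeof *values);
}

/* Sink 1: a growable in-memory buffer. */
typedef struct {
  unsigned char *data;
  size_t size;
  size_t capacity;
} mem_sink;

int mem_sink_write(void *user_data, const void *bytes, size_t len) {
  mem_sink *s = user_data;
  if (s->size + len > s->capacity) {
    size_t new_cap = s->capacity ? s->capacity : 64;
    while (new_cap < s->size + len) new_cap *= 2; /* grow geometrically */
    unsigned char *grown = realloc(s->data, new_cap);
    if (grown == NULL) return -1;
    s->data = grown;
    s->capacity = new_cap;
  }
  memcpy(s->data + s->size, bytes, len);
  s->size += len;
  return 0;
}

/* Sink 2: an already-open FILE*; the same serializer works unchanged. */
int file_sink_write(void *user_data, const void *bytes, size_t len) {
  return fwrite(bytes, 1, len, (FILE *)user_data) == len ? 0 : -1;
}
```

The sketch shows the trade-off described above: one serializer can target memory or a file, but the caller has to provide and manage the sink, which is more machinery than passing a single buffer.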
…segment-serialize
# Conflicts:
#	java/cuvs-java/src/main/java/com/nvidia/cuvs/CagraIndex.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/BruteForceIndexImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/CagraIndexImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/internal/DatasetImpl.java
#	java/cuvs-java/src/main/java22/com/nvidia/cuvs/spi/JDKProvider.java
#	java/cuvs-java/src/test/java/com/nvidia/cuvs/CagraBuildAndSearchIT.java
|
Closed in favour of #1085 |
NOTE: this WIP/draft uses code from #1033 and #1024 -- don't worry, I'll sort out the merge order of what we want/can merge and update accordingly.
This PR adds a new method to serialize indices (for now, only Cagra -- for demonstration purposes) to memory, instead of relying on a separate temp file. This is cleaner, saves 50% of disk space, and it's up to 10x faster.
The PR includes both the necessary Java changes and the related C API changes -- again, this draft is for discussion and to showcase how this could work; if we like the approach, my plan is to raise separate C and Java PRs (first get the C API change merged, then consume/use it in the Java API).
If we like the approach, a similar change would be needed for deserialization (CagraIndex#build/cuvsCagraDeserialize).
Benchmarks: